
    Inertial Stochastic PALM (iSPALM) and Applications in Machine Learning

    Inertial algorithms for minimizing nonsmooth and nonconvex functions, such as the inertial proximal alternating linearized minimization algorithm (iPALM), have demonstrated their superiority over their non-inertial variants with respect to computation time. In many problems in imaging and machine learning, the objective functions have a special form involving huge data sets, which encourages the application of stochastic algorithms. While algorithms based on stochastic gradient descent are still used in the majority of applications, stochastic algorithms for minimizing nonsmooth and nonconvex functions have recently been proposed as well. In this paper, we derive an inertial variant of a stochastic PALM algorithm with a variance-reduced gradient estimator, called iSPALM, and prove linear convergence of the algorithm under certain assumptions. Our inertial approach can be seen as a generalization of momentum methods, which are widely used to speed up and stabilize optimization algorithms, in particular in machine learning, to nonsmooth problems. Numerical experiments for learning the weights of a so-called proximal neural network and the parameters of Student-t mixture models show that our new algorithm outperforms both stochastic PALM and its deterministic counterparts.
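    To make the flavor of such methods concrete, the following minimal sketch shows the inertial proximal-gradient building block underlying PALM-type schemes: an extrapolation step with momentum parameter beta followed by a proximal-gradient step. The soft-thresholding prox, the quadratic toy problem and all parameter choices are illustrative assumptions; this is not the paper's iSPALM with its variance-reduced stochastic gradient estimator.

```python
import numpy as np

def soft_threshold(v, tau):
    # Proximal map of tau * ||.||_1, used here as an example nonsmooth term.
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def inertial_prox_grad_step(x, x_prev, grad, step, beta, tau):
    # Inertial extrapolation followed by a proximal-gradient step.
    y = x + beta * (x - x_prev)
    x_new = soft_threshold(y - step * grad(y), step * tau)
    return x_new, x  # new iterate and new "previous" iterate

# Toy usage: minimize 0.5*||A x - b||^2 + tau*||x||_1 with full gradients.
rng = np.random.default_rng(0)
A, b = rng.normal(size=(30, 10)), rng.normal(size=30)
grad = lambda z: A.T @ (A @ z - b)
step = 1.0 / np.linalg.norm(A, 2) ** 2   # 1/L for the smooth part
x, x_prev = np.zeros(10), np.zeros(10)
for _ in range(200):
    x, x_prev = inertial_prox_grad_step(x, x_prev, grad, step, beta=0.5, tau=0.1)
```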

    Variational models for color image correction inspired by visual perception and neuroscience

    Reproducing the perception of a real-world scene on a display device is a very challenging task which requires understanding of the camera processing pipeline, the display process, and the way the human visual system processes the light it captures. Mathematical models based on psychophysical and physiological laws of color vision, named Retinex, provide efficient tools to handle degradations produced during the camera processing pipeline, such as the reduction of contrast. In particular, Batard and Bertalmío [J. Math. Imaging Vis. 60(6), 849-881 (2018)] described some psychophysical laws on brightness perception as covariant derivatives, included them in a variational model, and observed that the quality of the color image correction is correlated with the accuracy of the vision model it includes. Based on this observation, we postulate that this model can be improved by including more accurate data on vision, with special attention paid here to visual neuroscience. Then, inspired by the presence in area V1 of the visual cortex of neurons responding to different visual attributes, such as orientation, color or movement, to name a few, and of horizontal connections modeling the interactions between those neurons, we construct two variational models to process both local (edges, textures) and global (contrast) features. This is an improvement with respect to the model of Batard and Bertalmío, as the latter cannot process local and global features independently and simultaneously. Finally, we conduct experiments on color images which corroborate the improvement provided by the new models.

    Wasserstein Steepest Descent Flows of Discrepancies with Riesz Kernels

    The aim of this paper is twofold. Based on the geometric Wasserstein tangent space, we first introduce Wasserstein steepest descent flows. These are locally absolutely continuous curves in the Wasserstein space whose tangent vectors point into a steepest descent direction of a given functional. This allows the use of Euler forward schemes instead of Jordan--Kinderlehrer--Otto schemes. For $\lambda$-convex functionals, we show that Wasserstein steepest descent flows are an equivalent characterization of Wasserstein gradient flows. The second aim is to study Wasserstein flows of the maximum mean discrepancy with respect to certain Riesz kernels. The crucial part here is the treatment of the interaction energy. Although it is not $\lambda$-convex along generalized geodesics, we give analytic expressions for Wasserstein steepest descent flows of the interaction energy starting at Dirac measures. In contrast to smooth kernels, the particle may explode, i.e., a Dirac measure becomes a non-Dirac one. The computation of steepest descent flows amounts to finding equilibrium measures with external fields, which nicely links Wasserstein flows of interaction energies with potential theory. Finally, we provide numerical simulations of Wasserstein steepest descent flows of discrepancies.
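    As a rough illustration of the Euler forward viewpoint, the sketch below evolves a particle approximation of a measure by explicit Euler steps on the squared MMD with the negative distance kernel K(x,y) = -||x-y|| (the Riesz kernel with r = 1). The step size, initialization and target are arbitrary assumptions, and the sketch does not reproduce the paper's steepest descent construction for flows starting at Dirac measures.

```python
import numpy as np

def mmd_grad(x, y):
    # Gradient of the squared MMD between the empirical measures of x (N,d)
    # and y (M,d) for the negative distance kernel K(a,b) = -||a-b||.
    def unit_diffs(a, b):
        d = a[:, None, :] - b[None, :, :]
        n = np.linalg.norm(d, axis=-1, keepdims=True)
        return np.divide(d, n, out=np.zeros_like(d), where=n > 0)
    N, M = len(x), len(y)
    attraction = unit_diffs(x, y).sum(axis=1) * (2.0 / (N * M))   # pull toward y
    repulsion = unit_diffs(x, x).sum(axis=1) * (2.0 / N ** 2)     # spread particles
    return attraction - repulsion

# Explicit Euler (forward) discretization of the flow.
rng = np.random.default_rng(1)
x = rng.normal(size=(50, 2)) * 0.1          # initial particles
y = rng.normal(size=(50, 2)) + 3.0          # particles of the target measure
for _ in range(500):
    x -= 0.05 * mmd_grad(x, y)
```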

    Alternatives to the EM Algorithm for ML-Estimation of Location, Scatter Matrix and Degree of Freedom of the Student-t Distribution

    In this paper, we consider maximum likelihood estimation of the degree of freedom parameter $\nu$, the location parameter $\mu$ and the scatter matrix $\Sigma$ of the multivariate Student-t distribution. In particular, we are interested in estimating the degree of freedom parameter $\nu$, which determines the tails of the corresponding probability density function and has rarely been considered in detail in the literature so far. We prove that under certain assumptions a minimizer of the negative log-likelihood function exists, where we have to take special care of the case $\nu \rightarrow \infty$, for which the Student-t distribution approaches the Gaussian distribution. As alternatives to the classical EM algorithm, we propose three other algorithms which cannot be interpreted as EM algorithms. For fixed $\nu$, the first algorithm is an accelerated EM algorithm known from the literature. However, since we do not fix $\nu$, we cannot apply standard convergence results for the EM algorithm. The other two algorithms differ from this algorithm in the iteration step for $\nu$. We show how the objective function behaves for the different updates of $\nu$ and prove for all three algorithms that it decreases in each iteration step. We compare the algorithms, as well as some accelerated versions, by numerical simulation and apply one of them for estimating the degree of freedom parameter in images corrupted by Student-t noise.
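    For orientation, the sketch below implements the classical EM fixed-point updates for the location $\mu$ and scatter $\Sigma$ at fixed $\nu$, i.e., the kind of baseline the paper compares against; the accelerated variants and the proposed $\nu$-updates are not reproduced here, and the synthetic data are purely illustrative.

```python
import numpy as np

def student_t_em(X, nu, iters=100):
    # Classical EM updates for the location mu and scatter Sigma of a
    # multivariate Student-t distribution with fixed degrees of freedom nu.
    n, d = X.shape
    mu, Sigma = X.mean(axis=0), np.cov(X, rowvar=False)
    for _ in range(iters):
        diff = X - mu
        maha = np.einsum('ij,jk,ik->i', diff, np.linalg.inv(Sigma), diff)
        w = (nu + d) / (nu + maha)                       # E-step weights
        mu = (w[:, None] * X).sum(axis=0) / w.sum()      # M-step: location
        diff = X - mu
        Sigma = (w[:, None] * diff).T @ diff / n         # M-step: scatter
    return mu, Sigma

# Toy usage on synthetic heavy-tailed data.
rng = np.random.default_rng(2)
X = rng.standard_t(df=5, size=(500, 3)) + np.array([1.0, -2.0, 0.5])
mu_hat, Sigma_hat = student_t_em(X, nu=5.0)
```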

    Generative Sliced MMD Flows with Riesz Kernels

    Maximum mean discrepancy (MMD) flows suffer from high computational costs in large scale computations. In this paper, we show that MMD flows with Riesz kernels $K(x,y) = -\|x-y\|^r$, $r \in (0,2)$, have exceptional properties which allow for their efficient computation. First, the MMD of Riesz kernels coincides with the MMD of their sliced version. As a consequence, the computation of gradients of MMDs can be performed in the one-dimensional setting. Here, for $r=1$, a simple sorting algorithm can be applied to reduce the complexity from $O(MN+N^2)$ to $O((M+N)\log(M+N))$ for two empirical measures with $M$ and $N$ support points. For the implementations we approximate the gradient of the sliced MMD by using only a finite number $P$ of slices. We show that the resulting error has complexity $O(\sqrt{d/P})$, where $d$ is the data dimension. These results enable us to train generative models by approximating MMD gradient flows by neural networks even for large scale applications. We demonstrate the efficiency of our model by image generation on MNIST, FashionMNIST and CIFAR10.
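    A minimal sketch of the sorting idea for $r=1$ follows: the 1D squared MMD with kernel K(x,y) = -|x-y| is computed from ordered-pair sums obtained by sorting, and then averaged over P random slices. Normalizing constants relating the sliced and unsliced MMD, as well as the paper's exact gradient computation, are omitted; all parameter choices are assumptions.

```python
import numpy as np

def ordered_pair_abs_sum(v):
    # sum_{i,j} |v_i - v_j| over all ordered pairs, in O(n log n) via sorting:
    # for sorted v, sum_{i<j} (v_(j) - v_(i)) = sum_j (2j - n - 1) v_(j).
    v = np.sort(v)
    n = len(v)
    return 2.0 * np.dot(2.0 * np.arange(1, n + 1) - n - 1, v)

def mmd2_1d(x, y):
    # Squared MMD of two 1D samples for K(a,b) = -|a-b|.
    N, M = len(x), len(y)
    s_x, s_y = ordered_pair_abs_sum(x), ordered_pair_abs_sum(y)
    s_xy = 0.5 * (ordered_pair_abs_sum(np.concatenate([x, y])) - s_x - s_y)
    return -s_x / N**2 - s_y / M**2 + 2.0 * s_xy / (N * M)

def sliced_mmd2(X, Y, P=100, rng=None):
    # Average the 1D squared MMD over P random projection directions.
    rng = np.random.default_rng(rng)
    theta = rng.normal(size=(P, X.shape[1]))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)
    return np.mean([mmd2_1d(X @ t, Y @ t) for t in theta])

X = np.random.default_rng(3).normal(size=(200, 10))
Y = np.random.default_rng(4).normal(size=(300, 10)) + 1.0
print(sliced_mmd2(X, Y))
```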

    Manifold Learning by Mixture Models of VAEs for Inverse Problems

    Representing a manifold of very high-dimensional data with generative models has been shown to be computationally efficient in practice. However, this requires that the data manifold admits a global parameterization. In order to represent manifolds of arbitrary topology, we propose to learn a mixture model of variational autoencoders. Here, every encoder-decoder pair represents one chart of a manifold. We propose a loss function for maximum likelihood estimation of the model weights and choose an architecture that provides us with analytical expressions of the charts and of their inverses. Once the manifold is learned, we use it for solving inverse problems by minimizing a data fidelity term restricted to the learned manifold. To solve the arising minimization problem, we propose a Riemannian gradient descent algorithm on the learned manifold. We demonstrate the performance of our method for low-dimensional toy examples as well as for deblurring and electrical impedance tomography on certain image manifolds.
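    To indicate how a data fidelity term can be minimized over the range of a learned decoder, the toy sketch below performs plain gradient descent in the latent space of a single, randomly initialized "chart" decoder with a hypothetical subsampling forward operator. This stands in for, but does not reproduce, the paper's mixture of trained VAE charts and its Riemannian gradient descent on the learned manifold.

```python
import torch

# Toy stand-ins: a single "chart" decoder and a forward operator (assumptions,
# not the paper's trained mixture of VAE charts).
decoder = torch.nn.Sequential(torch.nn.Linear(2, 16), torch.nn.Tanh(),
                              torch.nn.Linear(16, 64))
for p in decoder.parameters():
    p.requires_grad_(False)
forward_op = lambda img: img[..., ::2]                  # e.g. subsampling
y = forward_op(decoder(torch.randn(2)))                 # synthetic observation

# Minimize the data fidelity restricted to the range of the decoder by
# descending in its latent space.
z = torch.zeros(2, requires_grad=True)
opt = torch.optim.Adam([z], lr=1e-2)
for _ in range(500):
    opt.zero_grad()
    loss = 0.5 * torch.sum((forward_op(decoder(z)) - y) ** 2)
    loss.backward()
    opt.step()
reconstruction = decoder(z).detach()
```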

    PatchNR: Learning from Very Few Images by Patch Normalizing Flow Regularization

    Learning neural networks from only very little available information is an important ongoing research topic with tremendous potential for applications. In this paper, we introduce a powerful regularizer for the variational modeling of inverse problems in imaging. Our regularizer, called the patch normalizing flow regularizer (patchNR), involves a normalizing flow learned on small patches of very few images. In particular, the training is independent of the considered inverse problem, such that the same regularizer can be applied for different forward operators acting on the same class of images. By investigating the distribution of patches versus that of the whole image class, we prove that our model is indeed a MAP approach. Numerical examples for low-dose and limited-angle computed tomography (CT) as well as superresolution of material images demonstrate that our method provides very high quality results. The training set consists of just six images for CT and one image for superresolution. Finally, we combine our patchNR with ideas from internal learning for performing superresolution of natural images directly from the low-resolution observation without knowledge of any high-resolution image.
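    A hedged sketch of such a variational objective reads as follows: a data fidelity term plus a penalty given by the mean negative log-likelihood of image patches under a patch model. The patch extraction parameters, the forward operator and the stand-in patch likelihood (an i.i.d. standard normal model replacing a trained normalizing flow) are all assumptions for illustration.

```python
import numpy as np

def extract_patches(img, s=6, stride=3):
    # Collect flattened s x s patches of a 2D image (simple loop version).
    H, W = img.shape
    return np.array([img[i:i+s, j:j+s].ravel()
                     for i in range(0, H - s + 1, stride)
                     for j in range(0, W - s + 1, stride)])

def patchnr_objective(x, y, forward_op, patch_nll, lam=0.1):
    # Data fidelity plus patch regularizer, in the spirit of the variational
    # formulation described above. patch_nll should return the negative
    # log-likelihood of each flattened patch under the trained patch model;
    # here it is a placeholder argument.
    residual = forward_op(x) - y
    fidelity = 0.5 * np.sum(residual ** 2)
    patches = extract_patches(x)
    return fidelity + lam * np.mean(patch_nll(patches))

# Placeholder stand-in for a trained flow: i.i.d. standard normal patch model.
toy_nll = lambda P: 0.5 * np.sum(P ** 2, axis=1)
x = np.random.default_rng(5).random((32, 32))
print(patchnr_objective(x, x.copy(), forward_op=lambda z: z, patch_nll=toy_nll))
```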

    PCA Reduced Gaussian Mixture Models with Applications in Superresolution

    Despite the rapid development of computational hardware, the treatment of large and high-dimensional data sets is still a challenging problem. This paper provides a twofold contribution to the topic. First, we propose a Gaussian Mixture Model in conjunction with a reduction of the dimensionality of the data in each component of the model by principal component analysis, called PCA-GMM. To learn the (low-dimensional) parameters of the mixture model, we propose an EM algorithm whose M-step requires the solution of constrained optimization problems. Fortunately, these constrained problems do not depend on the usually large number of samples and can be solved efficiently by an (inertial) proximal alternating linearized minimization algorithm. Second, we apply our PCA-GMM for the superresolution of 2D and 3D material images based on the approach of Sandeep and Jacob. Numerical results confirm the moderate influence of the dimensionality reduction on the overall superresolution result.
    Multi-scale image superresolution in materials science with geometric attributes
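    As an illustration of the per-component dimensionality reduction in a PCA-GMM, the sketch below evaluates the log-likelihood of a mixture whose component covariances have a low-rank-plus-isotropic form U diag(s) U^T + sigma^2 I (a probabilistic-PCA-style parametrization). This parametrization is an assumption for illustration; the paper's exact constrained model and its PALM-based M-step are not reproduced.

```python
import numpy as np
from scipy.special import logsumexp

def component_logpdf(X, mean, U, s, sigma2):
    # Gaussian log-density with covariance U diag(s) U^T + sigma2 * I, where
    # U (d x q) has orthonormal columns spanning the component's PCA subspace.
    d = X.shape[1]
    cov = U @ np.diag(s) @ U.T + sigma2 * np.eye(d)
    diff = X - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = np.einsum('ij,ij->i', diff @ np.linalg.inv(cov), diff)
    return -0.5 * (d * np.log(2 * np.pi) + logdet + maha)

def pca_gmm_loglik(X, weights, means, Us, ss, sigma2s):
    # Log-likelihood of a GMM whose components are dimension-reduced as above.
    comp = np.stack([np.log(w) + component_logpdf(X, m, U, s, v)
                     for w, m, U, s, v in zip(weights, means, Us, ss, sigma2s)])
    return logsumexp(comp, axis=0).sum()

# Toy usage with two components sharing a random 2D subspace of R^5.
rng = np.random.default_rng(6)
X = rng.normal(size=(100, 5))
U = np.linalg.qr(rng.normal(size=(5, 2)))[0]
print(pca_gmm_loglik(X, [0.5, 0.5], [np.zeros(5), np.ones(5)],
                     [U, U], [np.ones(2), np.ones(2)], [0.1, 0.1]))
```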

    WPPNets and WPPFlows: The Power of Wasserstein Patch Priors for Superresolution

    Exploiting image patches instead of whole images has proved to be a powerful approach to tackle various problems in image processing. Recently, Wasserstein patch priors (WPP), which are based on the comparison of the patch distributions of the unknown image and a reference image, were successfully used as data-driven regularizers in the variational formulation of superresolution. However, for each input image, this approach requires the solution of a non-convex minimization problem which is computationally costly. In this paper, we propose to learn two kinds of neural networks in an unsupervised way based on WPP loss functions. First, we show how convolutional neural networks (CNNs) can be incorporated. Once the network, called WPPNet, is learned, it can be applied very efficiently to any input image. Second, we incorporate conditional normalizing flows to provide a tool for uncertainty quantification. Numerical examples demonstrate the very good performance of WPPNets for superresolution in various image classes, even if the forward operator is known only approximately.
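    To give an impression of a WPP-style loss, the sketch below compares the patch distributions of two images by a sliced, sorting-based squared Wasserstein distance between their flattened patches. Patch size, stride, the number of slices and the choice of the sliced 2-Wasserstein distance are assumptions; the actual WPP loss in the paper may be defined differently.

```python
import numpy as np

def patches(img, s=6, stride=2):
    # Collect flattened s x s patches of a 2D image.
    H, W = img.shape
    return np.array([img[i:i+s, j:j+s].ravel()
                     for i in range(0, H - s + 1, stride)
                     for j in range(0, W - s + 1, stride)])

def sliced_patch_sw2(img_a, img_b, P=50, s=6, rng=None):
    # Sliced squared 2-Wasserstein distance between the patch distributions of
    # two images: project flattened patches onto random directions and compare
    # the sorted 1D projections (equal sample sizes via subsampling).
    rng = np.random.default_rng(rng)
    A, B = patches(img_a, s), patches(img_b, s)
    m = min(len(A), len(B))
    A = A[rng.choice(len(A), m, replace=False)]
    B = B[rng.choice(len(B), m, replace=False)]
    total = 0.0
    for _ in range(P):
        theta = rng.normal(size=s * s)
        theta /= np.linalg.norm(theta)
        total += np.mean((np.sort(A @ theta) - np.sort(B @ theta)) ** 2)
    return total / P

out = np.random.default_rng(7).random((40, 40))
ref = np.random.default_rng(8).random((40, 40))
print(sliced_patch_sw2(out, ref))
```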